A Local Tree Structure is Not Sufficient for the 
Local Optimality of Message-Passing Decoding in 
Low Density Parity Check Codes 
Abstract: We address the problem, 'Is a local tree structure
sufficient for the local optimality of the message-passing
algorithm in low density parity check codes?' It is shown that
the answer is negative. Using this observation, we pinpoint a
flaw in the proof of Theorem 1 in the paper 'The Capacity of
Low-Density Parity-Check Codes Under Message-Passing Decoding'
by Thomas J. Richardson and Rüdiger L. Urbanke [1]. We further
provide a new proof of that theorem based on a different argument.

I. Introduction 

The message-passing algorithm is an efficient and powerful
algorithm for decoding low density parity check codes. In
the paper 'The Capacity of Low-Density Parity-Check Codes
Under Message-Passing Decoding' by Thomas J. Richardson
and Rüdiger L. Urbanke, the authors present an extraordinary
way of analyzing the performance of low density parity check
codes by proving concentration results for the local cycle-free
structure (namely, the tree-like structure) and the related
density evolution. We use the same notation and definitions
as in [1]. In the proof of Theorem 1 (Monotonicity for
Physically Degraded Channels), the authors use the following
observation:

"... [assuming tree-like neighborhoods $\mathcal{N}_{\vec{e}}^{2\ell}$]
for a belief-propagation decoder, the sign of the message sent
along edge e in the $\ell$th iteration is equal to the estimate of
a maximum-likelihood estimator based on the observations in
$\mathcal{N}_{\vec{e}}^{2\ell}$ [2]".

This observation is a key argument in establishing Theorem
1, which is used to validate the error performance analysis
and threshold determination in the later parts of [1]. The
authors also use this observation to explain and analyze the
belief-propagation decoder in [1]. Somewhat surprisingly,
however, this observation is not true in general. We take a
closer look at this problem and discuss its consequences.

II. Message Passing Decoding for a Local Tree 
Structure: A Closer Look 

It is well known that when the message-passing algorithm is
applied to a tree-structured Tanner graph, the resulting estimate
for a variable node in the tree is equivalent to the
maximum-likelihood estimate of that variable node based
on the observations of the variable nodes involved in that
Tanner graph. This result was carried over in [1] to a local
tree-structured part of a larger Tanner graph, and it was
assumed that the same result holds for a tree-structured
local part conditioned on the local observations. As we will
see, this does not hold in general. Although the constraints
specified by the local tree-structured Tanner graph are
constraints that the locally involved variable nodes satisfy,
they may not be the only constraints that those variable nodes
must satisfy. In other words, there may be implicit additional
constraints on the local tree-structured neighborhood once we
infer the complete Tanner graph for that local part. Those
additional constraints may bring cycles into the local
tree-structured part and undermine the optimality of the
belief-propagation decoder conditioned on local observations.
The detailed analysis and examples follow.

We consider an LDPC code of length $n$ with parity check
matrix $H$ of dimension $m \times n$. The codebook of this
LDPC code is denoted by $C$. Without loss of generality, we
assume the code to be a regular low density parity check code,
in which each variable node has degree $d_v$ and each parity
check node has degree $d_c$. Each codeword in the codebook
is transmitted with equal probability through a binary-input
memoryless channel; we denote the transmitted codeword
by $x^n$ and the received symbols by $y^n$. Now let us consider
the directed neighborhood of depth 2 of the directed edge
$\vec{e} = (v, c)$, as shown in Figure 1, which is the same as
Figure 2 in [1].

Suppose that we have a local observation $y_{\mathcal{I}}$ of the
received vector $y^n$, where $\mathcal{I}$ is the set of indices $i$
such that $y_i$ is the observation of the transmitted bit $x_i$
corresponding to a variable node involved in the local tree-like
neighborhood $\mathcal{N}_{\vec{e}}^{2}$. Now we are in a position to
derive the maximum-likelihood estimate of the variable node $v$
based on the local observation $y_{\mathcal{I}}$.
Obviously, we have the following equations:



(2) 



(3) 



$$P(x_i = 0, y_{\mathcal{I}}) = \sum_{x^n \in C:\; x_i = 0} P(x^n)\, P(y_{\mathcal{I}} \mid x^n) \qquad (4)$$

Since the channel is a binary-input memoryless channel, we
have

$$P(y_{\mathcal{I}} \mid x^n) = P(y_{\mathcal{I}} \mid x_{\mathcal{I}}) \qquad (5)$$



In (5), $x_{\mathcal{I}}$ denotes the set of transmitted bits
corresponding to the index set $\mathcal{I}$. We further denote by
$C_{\mathcal{I}}$ the sub-codebook for $x_{\mathcal{I}}$. Combining the
preceding equations with (5), we have

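$$P(y_{\mathcal{I}} \mid x_i = 1) \propto \sum_{x_{\mathcal{I}} \in C_{\mathcal{I}}:\; x_i = 1} P(x_{\mathcal{I}})\, P(y_{\mathcal{I}} \mid x_{\mathcal{I}}) \qquad (6)$$

$$P(y_{\mathcal{I}} \mid x_i = 0) \propto \sum_{x_{\mathcal{I}} \in C_{\mathcal{I}}:\; x_i = 0} P(x_{\mathcal{I}})\, P(y_{\mathcal{I}} \mid x_{\mathcal{I}}) \qquad (7)$$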

The derivation of (6) and (7) follows from the fact that each
codeword in the sub-codebook $C_{\mathcal{I}}$ induced from the
original code $C$ has equal probability.

Without loss of generality, we index the variable nodes of
the bottom layer in Fig. 1 as 1, 2, ..., 10 from left to
right. Then the index set is $\mathcal{I} = \{v\} \cup \{1, 2, \ldots, 10\}$.
The belief-propagation decoder works perfectly if the codeword
space $C_{\mathcal{I}}$ is the linear space whose parity checks are
exactly the two lower parity check nodes in Fig. 1 [3]. However,
the linear space of the codebook $C_{\mathcal{I}}$ may be a proper
subspace of the linear space described in Fig. 1.

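For example, suppose we have the additional constraints

$$x_1 \oplus x_{11} \oplus x_{12} \oplus x_{13} \oplus x_{14} \oplus x_{15} = 0; \qquad (8)$$
$$x_2 \oplus x_{11} \oplus x_{12} \oplus x_{13} \oplus x_{14} \oplus x_{15} = 0. \qquad (9)$$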

Here $x_{11}, \ldots, x_{15}$ are variable nodes that do not appear
in the local neighborhood $\mathcal{N}_{\vec{e}}^{2}$. Although we do not
have explicit local cycles, adding (8) and (9) gives the implicit
constraint $x_1 = x_2$ on $\mathcal{N}_{\vec{e}}^{2}$. Since the subcode
is a proper subspace of the linear space specified by the local
tree structure alone, the prior probability of the transmitted
subcodeword is not uniform over the linear space specified by
$\mathcal{N}_{\vec{e}}^{2}$. In this case, the belief-propagation decoder
will not necessarily give the maximum-likelihood estimate of the
bit corresponding to the variable node $v$.
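
The effect of the implicit constraint can be checked numerically on the depth-2 neighborhood described above (root $v$, two check nodes, ten leaves). The following is a minimal sketch, not taken from [1]: the log-likelihood ratios in `llr` are hypothetical values chosen for illustration, and the function names are our own. It compares the belief-propagation output at the root with a brute-force likelihood ratio computed over the tree code and over the subcode with $x_1 = x_2$.

```python
import itertools
import math

def bp_root_llr(llr_v, leaves_per_check):
    """Belief-propagation LLR at the root: channel LLR plus one
    tanh-rule message from each check node below the root."""
    total = llr_v
    for leaves in leaves_per_check:
        prod = 1.0
        for l in leaves:
            prod *= math.tanh(l / 2.0)
        total += 2.0 * math.atanh(prod)
    return total

def brute_force_root_llr(llr, codebook):
    """log P(y | v = 0) / P(y | v = 1) for equiprobable codewords.
    llr[j] = log P(y_j | x_j = 0) / P(y_j | x_j = 1); index 0 is the root v."""
    mass = [0.0, 0.0]
    for word in codebook:
        # Likelihood of the word up to a factor common to all words.
        mass[word[0]] += math.exp(-sum(l for l, bit in zip(llr, word) if bit))
    return math.log(mass[0] / mass[1])

def local_codebook(extra=lambda w: True):
    """Words (v, x1, ..., x10) satisfying the two tree checks and an
    optional extra (implicit) constraint."""
    words = []
    for w in itertools.product((0, 1), repeat=11):
        check1 = (w[0] + sum(w[1:6])) % 2 == 0
        check2 = (w[0] + sum(w[6:11])) % 2 == 0
        if check1 and check2 and extra(w):
            words.append(w)
    return words

# Hypothetical observations: y1 and y2 disagree strongly with each other.
llr = [0.1, 3.0, -3.0, 2.0, 2.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0]

bp_llr = bp_root_llr(llr[0], [llr[1:6], llr[6:11]])
tree_ml_llr = brute_force_root_llr(llr, local_codebook())
subcode_ml_llr = brute_force_root_llr(llr, local_codebook(lambda w: w[1] == w[2]))

print("BP LLR at root:          ", bp_llr)
print("ML LLR, tree code only:  ", tree_ml_llr)
print("ML LLR, with x1 = x2:    ", subcode_ml_llr)
```

Under these assumed observations the belief-propagation value agrees with the brute-force value for the tree code, but its sign differs from that of the brute-force value for the subcode, so the two decisions on $v$ disagree.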



III. Comments on the Proof of Theorem 1 in [1] 

Let us now look at Theorem 1 in [1]:

"Let $W$ and $W'$ be two given memoryless channels that
fulfill the required channel symmetry conditions. Assume that
$W'$ is physically degraded with respect to $W$. For a given
code and a belief-propagation decoder, let $p$ be the expected
fraction of incorrect messages passed at the $\ell$th decoding
iteration assuming tree-like neighborhoods and transmission
over channel $W$ and let $p'$ denote the equivalent quantity for
transmission over channel $W'$. Then $p \le p'$."
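
As a concrete illustration of the monotonicity statement (it is not part of the argument in [1] or in this note), consider the binary erasure channel, where a channel with larger erasure probability is physically degraded with respect to one with smaller erasure probability. The sketch below uses the standard erasure recursion for a regular $(d_v, d_c)$ ensemble under the tree assumption, with the fraction of erased messages playing the role of $p$. The choice of the erasure channel, the $(3, 6)$ degrees, and the function name are assumptions made for illustration.

```python
def bec_erasure_fraction(eps, dv=3, dc=6, iterations=10):
    """Expected fraction of erased variable-to-check messages after
    `iterations` rounds on a tree-like neighborhood of a regular
    (dv, dc) code over an erasure channel with erasure probability eps."""
    x = eps  # iteration 0: a message is erased iff the channel output is
    for _ in range(iterations):
        # A check-to-variable message is erased unless all dc - 1 inputs are known.
        check_erased = 1.0 - (1.0 - x) ** (dc - 1)
        # A variable-to-check message is erased iff the channel output and
        # all dv - 1 incoming check messages are erased.
        x = eps * check_erased ** (dv - 1)
    return x

epsilons = [0.30, 0.35, 0.40, 0.45, 0.50]
fractions = [bec_erasure_fraction(e) for e in epsilons]
for e, f in zip(epsilons, fractions):
    print(f"eps = {e:.2f}: erased fraction after 10 iterations = {f:.6f}")
```

The printed fractions are non-decreasing in the erasure probability, which is the behavior the theorem asserts for this family of channels.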

Here is an outline of the original proof of Theorem 1 in [1]:

(A) "Since the transmitted bit associated with variable
node $v_i$ has uniform a priori probability, an ML
estimator is equal to a maximum a posteriori
estimator, which is known to yield the minimum
probability of error of all estimators based on the same
observation";

(B) The maximum a posteriori estimator based on the
received observation $R$, obtained by sending a randomly
chosen codeword through the channel $W$, is superior to
the maximum a posteriori estimator based on the received
observation $R'$, obtained by sending $R$ through an
auxiliary channel $Q$;

(C) "The claim now follows by observing that for a
belief-propagation decoder, the sign of the
message sent along edge e in the $\ell$th decoding
iteration is equal to the estimate of an ML estimator
based on the observation in $\mathcal{N}_{\vec{e}}^{2\ell}$ [2]"

However, two of these claims are inherently flawed. 

(A) With a local tree-like neighborhood, we do not
necessarily have a uniform prior probability for the
transmitted bit associated with $v_i$, and the sign of
the message sent along edge e by the belief-propagation
decoder is not necessarily equal to the maximum a
posteriori estimate, even if we assume that "the sign of
the message sent along edge e in the $\ell$th decoding
iteration is equal to the estimate of an ML estimator
based on the observation in $\mathcal{N}_{\vec{e}}^{2\ell}$ [2]".

(C) The belief-propagation decoder is not necessarily a
maximum-likelihood estimator based on local
observations, even for a local tree-like neighborhood,
as follows from the results of Section II.

We now give an example to explain claim (A) by considering
an LDPC code with $d_v = 3$, $d_c = 5$ and a tree-like
neighborhood of depth 2, which is similar to Fig. 1 except
that each check node has 4 descendant variable nodes. In
this case, we have $\mathcal{I} = \{v\} \cup \{1, 2, \ldots, 8\}$. As
in the example given in Section II, we can have the following
implicit equations even if there are no cycles in this
neighborhood.



$$x_1 = x_2, \quad x_3 = x_4; \qquad (10)$$

$$x_5 = x_6, \quad x_7 = x_8. \qquad (11)$$



We can easily infer from these additional constraints that
the variable node $v$ can only take the value 0 in GF(2)
(substituting them into either parity check leaves $v = 0$),
so that it does not have a uniform a priori probability. Thus
the argument that the maximum-likelihood estimator is
equivalent to the maximum a posteriori estimator is not valid
in general. So a maximum-likelihood estimator may not yield
the minimum probability of error, and the monotonicity of the
probability of error for physically degraded channels may be
lost. One may argue that this case of a redundant bit cannot
happen in a good low density parity check code. However, in
the analysis of random low density parity check codes
generated from a random graph as in [1], one cannot exclude
this case.
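
The claim about the root bit can be verified by enumeration. The sketch below is our own illustration, with hypothetical names: it lists every word $(v, x_1, \ldots, x_8)$ that satisfies the two parity checks of the depth-2 neighborhood together with the pairwise equalities among the leaves, and collects the values taken by $v$.

```python
import itertools

def local_subcode():
    """All words (v, x1, ..., x8) satisfying the two tree checks
    and the implicit pairwise equalities among the leaves."""
    words = []
    for w in itertools.product((0, 1), repeat=9):
        v, x = w[0], w[1:]  # x[0] is x1, ..., x[7] is x8
        check1 = (v + x[0] + x[1] + x[2] + x[3]) % 2 == 0
        check2 = (v + x[4] + x[5] + x[6] + x[7]) % 2 == 0
        pairs = (x[0] == x[1] and x[2] == x[3]
                 and x[4] == x[5] and x[6] == x[7])
        if check1 and check2 and pairs:
            words.append(w)
    return words

words = local_subcode()
root_values = {w[0] for w in words}
print("number of local subcodewords:", len(words))
print("values taken by the root bit v:", root_values)
```

Every word in the resulting subcode has $v = 0$, so the prior on the root bit is degenerate rather than uniform.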

The question now is whether Theorem 1 from [1] is true.
It turns out that we can establish this theorem by relating
the error probability of the belief-propagation decoder for
a local tree neighborhood to that of a belief-propagation
decoder for a globally tree-structured Tanner graph.

Consider a variable node $v_i$ with a local tree neighborhood
$\mathcal{N}_{\vec{e}}^{2\ell}$ and a variable node $v_j$ with a global
tree neighborhood of the same structure, which means that the
constraints specified by the tree structure
$\mathcal{N}_{\vec{e}}^{2\ell}$ are the only constraints on $v_j$. The
validity of Theorem 1 can be established through the following
arguments.

(A") For a fixed transmitted all-zero codeword and a
given channel realization, the sign of the message
sent along edge e in the $\ell$th decoding iteration is
equal to the sign of the maximum-likelihood estimate
of the bit $v_j$ based on the observation in
$\mathcal{N}_{\vec{e}}^{2\ell}$. Thus belief-propagation decoding
errors for $v_i$ and $v_j$ occur together, conditioned on
the same transmitted codeword and the same channel
realization.

(B") Due to the channel symmetry conditions, we conclude
that the error probability of belief-propagation
decoding for $v_i$ is the same as the error probability
of belief-propagation decoding for $v_j$, which is equal
to that of maximum-likelihood decoding for $v_j$
based on the observation within the neighborhood of
$v_j$.

(C") For $v_j$, the maximum-likelihood estimator based on
the observation in $\mathcal{N}_{\vec{e}}^{2\ell}$ is equivalent to
the maximum a posteriori estimator, for which the
estimate of $v_j$ based on $R$ has no larger error
probability than the estimate of $v_j$ based on $R'$.
Combining (B") and (C"), we obtain Theorem 1 for a
local tree neighborhood, whether or not there are
implicit local cycles.
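
The argument can be summarized as a single chain of relations. This is only a sketch: $v_i$ denotes the node with the local tree neighborhood, $v_j$ the node with the global tree neighborhood, and $P_e^{\mathrm{BP}}$, $P_e^{\mathrm{ML}}$, $P_e^{\mathrm{MAP}}$ are shorthand introduced here (not notation from [1]) for the error probabilities of the respective estimators over the indicated channel.

```latex
\begin{align*}
p &= P_e^{\mathrm{BP}}(v_i; W)
   = P_e^{\mathrm{BP}}(v_j; W)
   && \text{same tree, channel symmetry} \\
  &= P_e^{\mathrm{ML}}(v_j; W)
   = P_e^{\mathrm{MAP}}(v_j; W)
   && \text{global tree, uniform prior on } v_j \\
  &\le P_e^{\mathrm{MAP}}(v_j; W')
   && R' \text{ is a degraded version of } R \\
  &= P_e^{\mathrm{ML}}(v_j; W')
   = P_e^{\mathrm{BP}}(v_j; W')
   = P_e^{\mathrm{BP}}(v_i; W') = p'.
\end{align*}
```

The optimality of belief propagation is used only for $v_j$, where it holds; the local node $v_i$ enters only through the two outer equalities.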

To summarize, the original proof too optimistically took the
belief-propagation decoder to be the best estimator of
a variable node in a local tree neighborhood, which easily
implied the superiority of the belief-propagation algorithm for
the non-degraded channel. This argument fails, since a
belief-propagation decoder is not always the best estimator in
a local tree neighborhood. The new proof adds a new element:
it establishes, through the symmetries of the memoryless
channel, that the error probability of belief-propagation
decoding in a local tree neighborhood equals that of
belief-propagation decoding in a global tree neighborhood.
Although the belief-propagation decoder may be neither the
maximum-likelihood estimator nor the maximum a posteriori
estimator, it preserves monotonicity for physically degraded
channels through this equivalence. This outcome is largely due
to the simultaneous potential suboptimality of
belief-propagation decoders in both the non-degraded and the
physically degraded channel.

IV. Conclusions 

In this paper, we point out the subtle fact that a local tree
structure is not sufficient for the local optimality of the
message-passing algorithm for low density parity check codes.
Based on this, we discuss some subtle but serious flaws in the
original proof of Theorem 1 "Monotonicity for Physically
Degraded Channels" in [1]. We further provide a new proof of
Theorem 1 based on the equivalence between the error
performance of the message-passing algorithm for a local tree
neighborhood and its error performance for a global tree
neighborhood.